feat(import): add script tool for multiple hbase snapshot imports #4606
tianlei2 wants to merge 3 commits
    --serviceAccount=${SERVICE_ACCOUNT} \
    --usePublicIps=false \
    --enableSnappy=true \
    --skipRestoreStep=${SKIP_RESTORE} \
This is good, but how are we passing the restorePath?
I feel like the script should generate a custom restore path (made idempotent by adding a timestamp, etc.), use it for the restore, and pass it as the restorePath in every job.
Also, with this model, who cleans up the restore path? Is there a way to trigger a cleanup at the end of the script, or do we have a tool that can be used? We can also say it's a manual step, but then this script should output something to the tune of "the snapshot was imported, please clean up $RESTORE_PATH once validation succeeds."
Added some output for manual cleanup, and the restore path is now passed to every job.
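For reference, a minimal sketch of what a timestamped restore path plus a cleanup reminder could look like. The helper names, the `RESTORE_BASE` variable, the `restore-` prefix, and the placeholder bucket are all assumptions for illustration, not names taken from the script in this PR.

```bash
#!/bin/bash
# Hypothetical helper: build a unique restore path under a base location.
# RESTORE_BASE and the "restore-" prefix are assumptions, not names from the PR.
make_restore_path() {
  local base="${1%/}"                   # strip one trailing slash
  local stamp
  stamp="$(date -u +%Y%m%d-%H%M%S)"     # UTC timestamp keeps reruns from colliding
  printf '%s/restore-%s\n' "$base" "$stamp"
}

# Print the manual cleanup reminder suggested in the review.
print_cleanup_hint() {
  local restore_path="$1"
  echo "The snapshot was imported. Please clean up ${restore_path} once validation succeeds."
}

RESTORE_BASE="${RESTORE_BASE:-gs://example-bucket/hbase-restore}"   # placeholder bucket
RESTORE_PATH="$(make_restore_path "$RESTORE_BASE")"
# The same value would be passed to every job, e.g. --restorePath="${RESTORE_PATH}"
print_cleanup_hint "$RESTORE_PATH"
```

Computing the path once and reusing it keeps all shard jobs pointed at the same restored data, and the final message tells the operator exactly what to delete.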

    Example for manual parallel execution:
    ```bash
    ./run-snapshot-import.sh 0 3 &  # Run this first!
In this model, how does shard 1 know that it has to wait for the restore from shard 0 to finish? Won't it run and fail because the restore path is empty?
Updating this to run a restore-only command first.
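A rough sketch of the ordering this implies: run the restore to completion, then fan the shards out in parallel. `run_restore_only` and `run_shard` are hypothetical stand-ins for the real job launches, which are not shown here.

```bash
#!/bin/bash
# Hypothetical stand-ins; both names are assumptions. In a real script these
# would launch the restore-only job and a single shard's import job.
run_restore_only() { echo "restore done"; }
run_shard()        { echo "shard $1 done"; }

# Run the restore first, then start shards start..end in the background.
run_all() {
  local start="$1" end="$2" shard pids="" pid failed=0

  # Blocks until the restore returns; abort on failure so no shard
  # starts against an empty restore path.
  run_restore_only || { echo "restore failed" >&2; return 1; }

  shard="$start"
  while [ "$shard" -le "$end" ]; do
    run_shard "$shard" &
    pids="$pids $!"
    shard=$((shard + 1))
  done

  # Wait on each PID so a failing shard is reported rather than ignored.
  for pid in $pids; do
    wait "$pid" || failed=1
  done
  return "$failed"
}
```

Calling `run_all 0 3` would run the restore once and then shards 0 through 3 concurrently. This only helps if the restore command itself blocks until the restore has actually finished.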
…twork argument handling in snapshot import script
    <exclude>META-INF/*.RSA</exclude>
    <exclude>META-INF/**/pom.properties</exclude>
    <exclude>META-INF/**/pom.xml</exclude>
    <exclude>META-INF/MANIFEST.MF</exclude>
Removing the MANIFEST.MF exclusion preserves the Multi-Release: true header required to load JDK 21-compatible dependency classes and prevent startup crashes, while remaining fully backward-compatible and safe for JDK 8.
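One way to sanity-check this after a build is to look for the header in the shaded jar's manifest. The sketch below is only illustrative: `has_multi_release` is a hypothetical helper that reads manifest text on stdin, and it assumes `unzip` is available on the machine running the check.

```bash
#!/bin/bash
# Hypothetical helper: succeeds if the manifest text on stdin declares
# "Multi-Release: true". The match is case-insensitive and tolerates the
# CRLF line endings that jar manifests commonly use.
has_multi_release() {
  grep -qi '^Multi-Release:[[:space:]]*true'
}

# Example usage (not executed here; JAR_PATH is whatever jar was built):
#   unzip -p "$JAR_PATH" META-INF/MANIFEST.MF | has_multi_release \
#     && echo "Multi-Release header present"
```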
    @@ -0,0 +1,188 @@
    #!/bin/bash
Nit: maybe we put this file under a bin/ or scripts/ directory?
    # ------------------------------------------------------------------------------
    # Usage
    # ------------------------------------------------------------------------------
    # Usage: ./run-snapshot-import.sh <start_shard> <end_shard>
Both start and end shard are inclusive, right? Probably good to add a note about it?
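To make the intended semantics concrete, here is a small sketch of an inclusive range, assuming both bounds are meant to be inclusive as the question suggests. `shard_range` is a hypothetical helper, not a function from the script.

```bash
#!/bin/bash
# Hypothetical helper: print every shard index in [start, end],
# with both ends inclusive.
shard_range() {
  local start="$1" end="$2" shard
  if [ "$start" -gt "$end" ]; then
    echo "start_shard ($start) must be <= end_shard ($end)" >&2
    return 1
  fi
  shard="$start"
  while [ "$shard" -le "$end" ]; do   # -le, not -lt: end_shard is included
    echo "$shard"
    shard=$((shard + 1))
  done
}
```

Under this reading, `./run-snapshot-import.sh 0 3` would process four shards: 0, 1, 2, and 3.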
    END_SHARD=$2

    # Configurations
    JAR_PATH="target/bigtable-beam-import-2.18.2-SNAPSHOT-shaded.jar"
Nit: we shouldn't have the jar using a fixed snapshot version. Also, we should probably build the jar first?
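One possible approach, sketched under assumptions: resolve the jar with a glob instead of a pinned version, and fail with a build hint when nothing is found. `find_shaded_jar` is a hypothetical helper, the file name pattern is inferred from the path in the diff, and the Maven command in the message is only a suggestion.

```bash
#!/bin/bash
# Hypothetical helper: locate the shaded jar without hard-coding a version.
# The directory default and name pattern are assumptions based on the diff.
find_shaded_jar() {
  local dir="${1:-target}" jar found=""
  for jar in "$dir"/bigtable-beam-import-*-shaded.jar; do
    [ -f "$jar" ] || continue          # an unmatched glob stays literal; skip it
    if [ -n "$found" ]; then
      echo "multiple shaded jars in $dir; clean old builds first" >&2
      return 1
    fi
    found="$jar"
  done
  if [ -z "$found" ]; then
    echo "no shaded jar in $dir; build first, e.g. mvn clean package -DskipTests" >&2
    return 1
  fi
  echo "$found"
}

# Example usage (not executed here):
#   JAR_PATH="$(find_shaded_jar target)" || exit 1
```

Refusing to guess when several jars match avoids silently running a stale build.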
    # If running the launcher on a newer, unsupported JDK version (e.g. JDK 25+),
    # you can set JVM_OPTS to "-Dnet.bytebuddy.experimental=true" to bypass
    # ByteBuddy class generation crashes during pipeline construction.
    JVM_OPTS=""
But will this override the settings described in the comment?
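If the concern is that a plain `JVM_OPTS=""` assignment discards a value the caller exported, a default expansion would avoid that. The sketch below assumes the script reads `JVM_OPTS` from the environment; `effective_jvm_opts` is a hypothetical helper used only to show the behavior.

```bash
#!/bin/bash
# Hypothetical helper: return the caller's JVM_OPTS if set and non-empty,
# otherwise an empty string. In the script itself the equivalent one-liner
# would be:  JVM_OPTS="${JVM_OPTS:-}"
effective_jvm_opts() {
  printf '%s\n' "${JVM_OPTS:-}"
}

# Example usage (not executed here):
#   JVM_OPTS="-Dnet.bytebuddy.experimental=true" ./run-snapshot-import.sh 0 3
```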
    --restorePath=${RESTORE_DIR} \
    --jobName="restore-job" \
    "${NETWORK_ARGS[@]}"
    echo "✅ Restore completed."
Does the Java job have awaitPipelineFinish at the end? Otherwise it will just submit the job, and we will need to track progress on the Dataflow dashboard.
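If the Java job turns out not to block (in Beam, blocking is normally done with `PipelineResult.waitUntilFinish()`), the script could poll the job state itself before printing the completion message. The following is only a sketch under assumptions: it presumes `gcloud` is installed and authenticated, that the job ID and region are known to the caller, and `get_job_state` and `wait_for_job` are hypothetical helper names.

```bash
#!/bin/bash
# Assumption: gcloud is installed and authenticated. Prints the job's
# current state, e.g. JOB_STATE_RUNNING or JOB_STATE_DONE.
get_job_state() {
  gcloud dataflow jobs describe "$1" --region="$2" --format='value(currentState)'
}

# Poll until the job reaches a terminal state.
# Returns 0 on success, 1 on failure/cancellation, 2 if the state lookup fails.
wait_for_job() {
  local job_id="$1" region="$2" interval="${3:-30}" state
  while true; do
    state="$(get_job_state "$job_id" "$region")" || return 2
    case "$state" in
      JOB_STATE_DONE)
        return 0 ;;
      JOB_STATE_FAILED|JOB_STATE_CANCELLED)
        echo "job $job_id ended in $state" >&2
        return 1 ;;
    esac
    sleep "$interval"
  done
}

# Example usage (not executed here; JOB_ID and REGION are placeholders):
#   wait_for_job "$JOB_ID" "$REGION" && echo "Restore completed."
```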
b/429250716
This is the second PR with the shell tool script. This is a child of #4600, so the only file that needs to be reviewed is the script itself (bigtable-dataflow-parent/bigtable-beam-import/run-snapshot-import.sh). If this looks OK, I will merge this with the parent PR.
The script is originally from https://docs.google.com/document/d/1T7BQF-AYY8xGbdbhwkV7zPA11A_Fh5SRjA_7nkzeTos/edit?tab=t.twzz7rrh6jz7#heading=h.m7p3xy7m76u2